ABSTRACT


The objective of this project is to devise a method to determine the position and
orientation of the links of a PUMA 560 using fiducial marks. As a result, it is
necessary to design fiducial marks and a corresponding feature extraction algorithm.
The marks utilized are composites of three basic shapes: a circle, an equilateral
triangle, and a square.

Once a mark is imaged, it is thresholded and the borders of each shape are
extracted. These borders are subsequently utilized in a feature extraction algorithm.
Two feature extraction algorithms are compared to determine which one produces the
more reliable results. The first algorithm is based on moment invariants and the
second algorithm is based on the discrete version of the ψ-s curve of the boundary.
The latter algorithm is clearly superior for this application.
CHAPTER 1
INTRODUCTION 

1.1 Overview 

Computer vision is an essential part of any intelligent robotic system. It
serves as a means to identify the environment and to verify the robot's location
and orientation. This information provides the parameters that determine the proper
action to be taken by the system.

In the verification of the robot's position and orientation, a means must be
found to identify the position and orientation of each joint. The use of fiducial marks
on each link seems to be an effective and cost-efficient way to accomplish this task.
The marks provide a way of uniquely identifying each link and locating a point on
the link. This point, imaged by two calibrated CCD cameras, determines the link's
3-D location.

There are several criteria the marks must satisfy to be effective tools. They
must be simple enough to remain identifiable under conditions of low
resolution, perspective distortion, rotation, scale change, and translation. Stated
another way, there must exist an algorithm that can extract
features from the marks that are rotation, translation, and scale invariant, and
the features should be as insensitive as possible to perspective distortion. For a
discrete image it is not possible to have features that are truly rotation and scale
invariant. In reality the features are a function of the resolution of the digitizing grid.
Features that satisfy the above constraints are sufficient for the task of determining
the status of the link. The position and orientation of a link can be modeled, in
three-dimensional space, as an ordered rotation about the x, y, and z axes and a
translation. After this transformation, the link and its corresponding mark are subjected
to a projective transformation, which is a noninvertible transformation of
three-dimensional world space into a two-dimensional image plane. Therefore, every point
in the image is a function of its position in world coordinates, the focal length of
the CCD camera, and the field of view of the camera. In addition, the image of
the fiducial mark is a function of the distribution of the points around its centroid,
the three rotation angles of the plane that the marks lie upon, the position of the
centroid, the focal length, and the field of view.

Moments provide an excellent way to characterize mass distributions through
properties such as horizontal and vertical centralness, diagonality, horizontal and
vertical divergence, and horizontal and vertical imbalance. Another convenient feature
of moments is their ability to be normalized for scale changes, rotations, and
translations in the image plane. Many of these invariants can be obtained either by
using the theory of algebraic invariants, introduced by Cayley, Hamilton, and Sylvester,
or by requiring that certain lower order moments have a prescribed value and
normalizing the other moments with respect to these lower order moments. Moments
are also easy to calculate.

1.2 Literature Survey 

Many papers have been written on the use of moments in pattern recognition
applications. One of the first is the paper written by Hu[1]. In this paper Hu
discusses recognition of two-dimensional geometric patterns by using the classical
theory of algebraic invariants to derive moment invariants that are insensitive to
scale, position, and orientation. This method uses invariant moments based upon
uniquely determined principal axes and the method of absolute moment invariants.
These moment invariants are subsequently stored in a feature vector and compared,
using a minimum distance formulation, to feature vectors of known patterns.
Udagawa et al.[2] use moments to identify capital letters of the English
alphabet. Their method consists of normalizing linearly distorted patterns by setting
certain conditions on the lower order moments. The method essentially normalizes
for any distortion due to an affine transformation. The normalized moments are
used as recognition features.

Alt[3] uses moments to identify letters and numbers. He normalizes each
pattern with respect to position, size, stretching and squeezing in the x or y directions,
and slanting in the x direction. The patterns are not rotationally invariant.
Rotational variance is retained to facilitate the discrimination of 6's and 9's. The slant
invariance is utilized to identify italics and bold faced letters as the same pattern.
Normalization is accomplished by utilizing the standard deviation and the
regression coefficient of x on y. Third through sixth order moments are calculated for the
twenty-six capital letters and nine numbers. The discrimination algorithm searches
for gaps in the values of a particular moment. These gaps are discrimination points
that separate certain patterns from others. Once subregions are formed based on
these points, another moment is used to break the subregions into smaller regions.
The process continues until each subregion consists of one element.

Casey[4] deals with the problem of normalizing handprinted characters.
Because of the large disparity in handwriting styles, recognition of characters is a
difficult task. He models the distortion as an affine transformation. This
information is used to direct the direction of scan of an optical character recognition
device to obtain a more uniform scan of letters. He uses the same methodology as
Udagawa.

Smith and Wright[5] use the method of moments to estimate the location,
orientation, length, width, and heading of a ship. The estimates are obtained by taking
moments of a ship photograph and using linear, quadratic, and cubic polynomial
functions of the moments as estimators of the ship descriptors. The best moments
for each polynomial are chosen using linear regression. This research verifies the
feasibility of using moments to interpret ship photographs.

Dudani, Breeding, and McGhee[6] address the problem of aircraft
identification. The images of the airplanes are binarized and moment invariants are extracted
from the image. They preprocess the two-dimensional binary image of the
three-dimensional aircraft and extract a clean silhouette and its corresponding boundary.
The algorithm employed is orientation invariant. The dimensionality of the feature
vector is kept as low as possible and is shown to be invariant to translation in the
plane normal to the optical axis. The moment invariants employed are the Hu
invariants divided by a power of the radius of gyration. They calculate two sets of
moments, one for the silhouette and one for the boundary. The boundary moments
are found to contain a large amount of information on the high frequency content of
the image. To identify the images they create a recognizer that consists of 3,000
live images of six types of aircraft. These samples are obtained by imaging each
aircraft at various orientations. They then map the feature space to a space defined
by the set of eigenvectors corresponding to the training sample covariance matrix.
The set of feature components is ordered according to the information content. Two
types of decision rules are employed to classify unknown images: the Bayes decision rule
and the distance-weighted k-nearest neighbor algorithm. The results of the
algorithms are compared to the decisions made by human observers. Both algorithms
outperform the human observers, but each computer decision takes thirty seconds
whereas the human observers take between ten and fifteen seconds. The algorithm
achieves reasonable accuracy in estimating the aircraft's inclination. The errors are
typically between five and ten degrees.

Teague[7] addresses the issue of classifying and manipulating optical
information by utilizing moments. He summarizes the properties of the lower order
geometric moments. The merits of Zernike moments are addressed in relation to rotational
invariance and optimal reconstruction of an image. It is also shown that Zernike
moments can be easily derived from the geometric moments. He demonstrates the
advantage of using orthogonal moments in image reconstruction.

Wong and Hall[8] use geometric moment invariants to match radar images
to their corresponding optical images. Because the invariants are calculated for
continuous images, these moments are not strictly invariant for digital images. The
amount of discrepancy is a function of the amount of scale, translation, and
rotation change. According to their data, reasonably good results can be obtained
for rotations up to forty-five degrees and scale changes of less than a factor of two.
They designed a hierarchical search technique to match the radar to the optical
scenes. This scheme consists of extracting a structural set of images, both radar
and optical, which are of decreasing size and resolution. The match sequence starts
with the lower resolution images. A thresholding algorithm and decision rule are
utilized to guide the search from a lower resolution level to a higher resolution level.
The rules are selected to find the most promising locations at each level. Only
these areas are tested at the next higher resolution. A product correlator is used to
match the invariant moments of the radar subimages to their corresponding optical
subimages.

Boyce and Hossack[9] construct feature vectors of arbitrary order while
maintaining the significance of the higher order components of the feature vector. The
features are Zernike moments and the rotational moments. Reconstruction of the
images based upon a finite number of Zernike moments is discussed. The invariants
used in the feature vector are rotational moments. The transformation is invariant
to scale, intensity, rotation, and translation. The goal is to create features that are
independent and are of approximately equal orders of magnitude. This ensures that
the information content is not overly sensitive to noise. The rotational moments
are used to identify the image and the Zernike moments are used to reconstruct the
image.

Khotanzad and Hong[10] describe new rotationally invariant features using
Zernike moments, and a systematic method to select the desired number of features.
This is accomplished by evaluating the discrimination power of the information
content of the ith order features of different classes. The patterns are grouped into
pairs. The pairs are subsequently rotationally aligned, and the Hamming distance
of the information content of the pair is taken. A cumulative measure of the
Hamming distance is obtained, and this is divided by the total number of pixels. This
value is divided by one more than the feature number to provide a measure of the
discrimination power. When this discrimination power exceeds a preset threshold,
the number of features needed is known for the pair. The maximum value over
all the pairs is taken as the number of features needed for the given patterns.

1.3 Author’s contribution 

It is necessary to identify the position and orientation of the links of a robotic
manipulator (PUMA 560). To accomplish this task several fiducial marks will be
placed upon each link. What remains is to identify each fiducial mark (a pattern
recognition problem) and to locate a point associated with each link (the centroid).
The first portion of the problem consists of designing an adequate number of simple
fiducial marks. This is done to label the links sufficiently and to facilitate the
extraction of recognizable features under conditions of low resolution and perspective
or orthogonal projection.

The marks employed are designed from simple shapes such as circles, squares,
and equilateral triangles. Since it is necessary to generate a large number of marks
from these basic shapes, the idea of nesting shapes within shapes is introduced. Each
composite pattern is designed such that each interior shape is completely contained
in its parent shape, and each interior shape has a grey level intensity value that
contrasts with its parent shape. Using this methodology and two levels of nesting, it
is possible to generate twenty-seven unique fiducial marks.
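
The count of twenty-seven can be checked with a short enumeration. The sketch below assumes that a mark is an outer shape with two nested interior shapes, and that any of the three basic shapes may appear at any level; the names `SHAPES` and `enumerate_marks` are illustrative and do not come from the original work.

```python
from itertools import product

# The three basic shapes named in the text.
SHAPES = ("circle", "triangle", "square")

def enumerate_marks(levels_of_nesting=2):
    """Return every composite mark as a tuple ordered from outermost to innermost shape.

    Assumption: one outer shape plus `levels_of_nesting` interior shapes,
    with repetition of a shape allowed at every level.
    """
    return list(product(SHAPES, repeat=levels_of_nesting + 1))

marks = enumerate_marks()
print(len(marks))  # 27
print(marks[0])    # ('circle', 'circle', 'circle')
```

Under these assumptions each mark is identified by an ordered triple of shape labels, which is why recognizing the three basic shapes is enough to identify all twenty-seven marks.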

Since the twenty-seven generated fiducial marks are composites of the three
basic shapes, the extraction of the borders of each of the shapes contained within
the mark reduces the recognition problem from one of extracting the features from
twenty-seven unique patterns to that of extracting features of three shapes. Once
the borders of each of the interior shapes are identified, the results are combined to
yield the correct identification of the mark.

Moments are extracted from each of the borders to determine whether they
are reliable features. Since the moments are relatively simple to calculate, it is
of interest to determine if they can be used to identify the shapes in binary, low
resolution, perspective distorted images. It is also of interest to determine whether
the normalized moments of orders two and three can be used to accomplish this
recognition problem.

Features based on the ψ-s curves of the boundary are also used in this study.
In particular, the measures of curvature obtained from these functions are used
as features. These curvature measures are essentially local feature descriptors, and
therefore are more susceptible to noisy border extractions and quantization effects.
It is of interest to determine how these features perform under poor image conditions.
CHAPTER 2

MATHEMATICAL BACKGROUND 

2.1 Rotation Matrices In Three-Dimensional Space 

In many vision and/or robotics applications, it is convenient to represent
rotations of bodies or points around arbitrary axes in matrix form.
Following [12], consider the derivation of the rotation matrix for rotations about the
x-, y-, and z-axes (figure 2.1).

A convenient way to view this rotation is to consider two coordinate systems,
XYZ and UVW, centered at the origin and initially coincident.

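Consider now a point P in this three-dimensional space. It has a representation
in each of the coordinate systems, denoted by

$$P_{uvw} = \begin{pmatrix} P_u \\ P_v \\ P_w \end{pmatrix} \quad \text{and} \quad P_{xyz} = \begin{pmatrix} P_x \\ P_y \\ P_z \end{pmatrix}$$
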

The point P is assumed to be rigidly attached to the UVW system. The goal is to
find a rotation matrix that represents the rotation of the UVW coordinate system
and the point P with respect to the XYZ coordinate system.

The point $P_{uvw}$ can be represented as a linear combination of the basis vectors
of the UVW system:

$$P_{uvw} = P_u \mathbf{i}_u + P_v \mathbf{j}_v + P_w \mathbf{k}_w \tag{2.1}$$

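To obtain the projection of P onto each of the basis vectors of the XYZ system, the dot
product of P is taken with each of those basis vectors. The results are as
follows:

$$P_x = \mathbf{i}_x \cdot P = \mathbf{i}_x \cdot \mathbf{i}_u P_u + \mathbf{i}_x \cdot \mathbf{j}_v P_v + \mathbf{i}_x \cdot \mathbf{k}_w P_w \tag{2.2}$$

$$P_y = \mathbf{j}_y \cdot P = \mathbf{j}_y \cdot \mathbf{i}_u P_u + \mathbf{j}_y \cdot \mathbf{j}_v P_v + \mathbf{j}_y \cdot \mathbf{k}_w P_w \tag{2.3}$$

$$P_z = \mathbf{k}_z \cdot P = \mathbf{k}_z \cdot \mathbf{i}_u P_u + \mathbf{k}_z \cdot \mathbf{j}_v P_v + \mathbf{k}_z \cdot \mathbf{k}_w P_w \tag{2.4}$$

This can be expressed in matrix form as

$$\begin{pmatrix} P_x \\ P_y \\ P_z \end{pmatrix} = \begin{pmatrix} \mathbf{i}_x \cdot \mathbf{i}_u & \mathbf{i}_x \cdot \mathbf{j}_v & \mathbf{i}_x \cdot \mathbf{k}_w \\ \mathbf{j}_y \cdot \mathbf{i}_u & \mathbf{j}_y \cdot \mathbf{j}_v & \mathbf{j}_y \cdot \mathbf{k}_w \\ \mathbf{k}_z \cdot \mathbf{i}_u & \mathbf{k}_z \cdot \mathbf{j}_v & \mathbf{k}_z \cdot \mathbf{k}_w \end{pmatrix} \begin{pmatrix} P_u \\ P_v \\ P_w \end{pmatrix}$$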

Keeping in mind that any rotation can be achieved by successive rotations about
each of the three axes in the XYZ system, all that needs to be done is to obtain
a matrix representation of rotations about each of the coordinate axes and then
multiply the three matrices to obtain a composite rotation matrix. A rotation
around the x-axis by an angle $\alpha$ leaves the $\mathbf{i}_u$ axis fixed in relationship to the XYZ
coordinate system (figure 2.2). Since the $\mathbf{i}_u$ axis is coincident with the $\mathbf{i}_x$ axis,
and the $\mathbf{j}_v$ and $\mathbf{k}_w$ axes are rotated by an angle $\alpha$ with respect to the $\mathbf{j}_y$ and
$\mathbf{k}_z$ axes respectively, the following rotation matrix is obtained:

$$R_{x,\alpha} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{pmatrix}$$

The same procedure is followed for rotations about the y-axis by an angle $\phi$ to
obtain the following matrix:

$$R_{y,\phi} = \begin{pmatrix} \cos\phi & 0 & \sin\phi \\ 0 & 1 & 0 \\ -\sin\phi & 0 & \cos\phi \end{pmatrix}$$

Rotation about the z-axis by an angle $\theta$ is represented by:

$$R_{z,\theta} = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

A composite rotation matrix is obtained by multiplying the matrices together.
Since matrix multiplication is not commutative, the order of multiplication is
important. For example, if one wanted to obtain the composite rotation matrix for a
rotation about the z-axis by $\theta$, followed by a rotation about the y-axis by $\phi$, and
then a rotation about the x-axis by $\alpha$, the composite rotation matrix would be
$R = R_{x,\alpha} R_{y,\phi} R_{z,\theta}$, where

$$R = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} \cos\phi & 0 & \sin\phi \\ 0 & 1 & 0 \\ -\sin\phi & 0 & \cos\phi \end{pmatrix} \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

To represent the rotated point $P_{uvw}$ in terms of the XYZ coordinate system,
it is premultiplied by the composite rotation matrix:

$$\begin{pmatrix} P_x \\ P_y \\ P_z \end{pmatrix} = R \begin{pmatrix} P_u \\ P_v \\ P_w \end{pmatrix}$$
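
The composition of rotations about fixed axes can be sketched in a few lines. The example below assumes the usual right-handed form of the three elementary matrices (upper off-diagonal sine negative for the x- and z-axes, lower off-diagonal sine negative for the y-axis); the function names are illustrative only.

```python
import math

def rot_x(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, p):
    """Premultiply the column vector p by the 3x3 matrix m."""
    return [sum(m[i][k] * p[k] for k in range(3)) for i in range(3)]

def composite(alpha, phi, theta):
    """R = R_x(alpha) R_y(phi) R_z(theta): z first, then y, then x, about fixed XYZ axes."""
    return matmul(rot_x(alpha), matmul(rot_y(phi), rot_z(theta)))

# A point on the u-axis, rotated 90 degrees about z, lands on the y-axis.
p_xyz = apply(composite(0.0, 0.0, math.pi / 2), [1.0, 0.0, 0.0])

# Order matters: x-after-z and z-after-x send the same point to different places.
x_after_z = apply(matmul(rot_x(math.pi / 2), rot_z(math.pi / 2)), [1.0, 0.0, 0.0])
z_after_x = apply(matmul(rot_z(math.pi / 2), rot_x(math.pi / 2)), [1.0, 0.0, 0.0])
```

Here `x_after_z` is approximately (0, 0, 1) while `z_after_x` is approximately (0, 1, 0), which illustrates why the order of multiplication must be fixed in advance.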
Figure 2.3: Pinhole Camera Model

2.2 Perspective Transformation

Following [13], the perspective projective transformation for a camera is
modeled by a pinhole camera. This model maps points from a three-dimensional world
space into a two-dimensional image plane. It is initially assumed that the coordinate
systems for the image points and the object points are coincident and centered in
the image plane. This is shown in figure 2.3.

From the above, it is readily observed that any imaged point lies on the line
connecting the object point to the center of projection. Using this observation,
a relationship between the imaged point $(x_i, y_i)$ and the object point $(x_o, y_o, z_o)$
is obtained, where f is the focal length and k is a scalar:

$$k \begin{pmatrix} x_i \\ y_i \\ -f \end{pmatrix} = \begin{pmatrix} -x_o \\ -y_o \\ f - z_o \end{pmatrix}$$

Solving for image points in terms of object points, the following is obtained:

$$x_i = \frac{-x_o}{k}, \qquad y_i = \frac{-y_o}{k}$$

Solving for k, it is found that

$$k = \frac{f - z_o}{-f}$$

Finally, solving for the image points in terms of object points and focal length,
the following is obtained:

$$x_i = \frac{f x_o}{f - z_o}, \qquad y_i = \frac{f y_o}{f - z_o}$$
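
As a numerical illustration, the sketch below evaluates the projection in the form x_i = f x_o / (f - z_o), y_i = f y_o / (f - z_o), with the image plane at z = 0 and the center of projection at z = f. The function name `project` and the sample values are assumptions made for this example.

```python
def project(point, f):
    """Map an object point (x_o, y_o, z_o) to image coordinates (x_i, y_i).

    Assumes the image plane lies at z = 0 and the center of projection at
    z = f, so that x_i = f * x_o / (f - z_o) and y_i = f * y_o / (f - z_o).
    """
    x_o, y_o, z_o = point
    denom = f - z_o
    if denom == 0:
        raise ValueError("object point lies in the plane of the center of projection")
    return (f * x_o / denom, f * y_o / denom)

# With f = 5 and an object at z_o = 20 the denominator is -15,
# so the image coordinates change sign relative to the object.
x_i, y_i = project((3.0, 6.0, 20.0), 5.0)
print(x_i, y_i)  # -1.0 -2.0
```

Every object point of the form (3t, 6t, 5 + 15t) for t > 0 projects to this same image point, which is the noninvertibility discussed in the text: depth cannot be recovered from a single view.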


2.3 Homogeneous Coordinates

Homogeneous coordinates are an extremely useful tool for dealing
with coordinate transformations. They provide an efficient matrix form for the
representation of a combination of perspective transformations, rotations about the
x, y, or z axis, scale changes, and translations. As a result, the use of homogeneous
coordinates extends to the fields of computer graphics, robotics, and computer vision.
What follows is a brief introduction to the topic.

Homogeneous coordinates essentially transform an n×1 vector into an (n+1)×1
vector. This is accomplished by multiplying each of the n elements of the original
vector by a constant scale factor, denoted by w. These scaled quantities become the
first n elements of the new vector. The (n+1)th position of the new vector is occupied
by the scale factor. This concept can be clarified using the following example. Given
a point in three-dimensional Cartesian space, denoted by the vector

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix}$$

the homogeneous representation of this point would be denoted by

$$\begin{pmatrix} wx \\ wy \\ wz \\ w \end{pmatrix}$$

It is readily observed that a transformation from homogeneous coordinates
back to the original vector space is accomplished by dividing the first n elements
of the homogeneous coordinate vector by its (n+1)th element. An example might
prove useful in clarifying the transformation. Given a homogeneous coordinate vector

$$\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}$$

the Cartesian vector is represented as

$$\begin{pmatrix} a/d \\ b/d \\ c/d \end{pmatrix}$$

The above concept will prove to be extremely useful in the analysis of coordi- 
nate transformations. 
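
The two conversions can be written as a pair of small helper functions. This is a minimal sketch; the names `to_homogeneous` and `from_homogeneous` are illustrative and not part of the original work.

```python
def to_homogeneous(vector, w=1.0):
    """Scale each element by w and append w as the (n+1)th element."""
    if w == 0:
        raise ValueError("the scale factor w must be nonzero")
    return [w * c for c in vector] + [w]

def from_homogeneous(h):
    """Divide the first n elements by the (n+1)th element."""
    w = h[-1]
    if w == 0:
        raise ValueError("a zero scale factor has no Cartesian equivalent")
    return [c / w for c in h[:-1]]

h = to_homogeneous([1.0, 2.0, 3.0], w=2.0)
print(h)                    # [2.0, 4.0, 6.0, 2.0]
print(from_homogeneous(h))  # [1.0, 2.0, 3.0]
```

Any nonzero w represents the same Cartesian point, so w = 1 is the usual choice when a point is first converted.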
2.4 Coordinate Transformations Using Homogeneous Coordinates

A homogeneous transformation matrix is defined as an n×n matrix that maps
an n-dimensional homogeneous vector into another homogeneous
vector. In the case of a 4×4 transformation matrix, a 4×1 homogeneous vector
in one coordinate system is mapped into a 4×1 homogeneous vector in another
coordinate system.

For the special case of three-dimensional vector manipulations, the homoge- 
neous transformation matrix can be subdivided into four distinct operations: ro- 
tation, translation, scaling, and perspective transformation. Combinations of these 
transformations can be obtained by multiplying the matrices of the component trans- 
formations. 

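The rotational transformation can be represented as

$$\begin{pmatrix} R_{11} & R_{12} & R_{13} & 0 \\ R_{21} & R_{22} & R_{23} & 0 \\ R_{31} & R_{32} & R_{33} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

where the upper left 3×3 block, with elements $R_{ij}$, is the three-dimensional composite
rotation matrix R.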

The translation transformation is defined as

$$\begin{pmatrix} 1 & 0 & 0 & P_x \\ 0 & 1 & 0 & P_y \\ 0 & 0 & 1 & P_z \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

A scale transformation is represented as

$$\begin{pmatrix} k_x & 0 & 0 & 0 \\ 0 & k_y & 0 & 0 \\ 0 & 0 & k_z & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

where $k_x$, $k_y$, and $k_z$ are scale factors in the x, y, and z directions respectively.

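For the case of a perspective transformation using a pinhole camera model
and back projection, the homogeneous transformation matrix can be represented in
two forms:

$$\begin{pmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & f & 0 \\ 0 & 0 & -1 & f \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & -1/f & 1 \end{pmatrix}$$

Typically, the latter form is employed.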
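
A short sketch shows how such matrices are combined and applied. It assumes the perspective matrix whose last row is (0, 0, -1/f, 1), which reproduces x_i = f x_o / (f - z_o) after division by the fourth element; all function names here are illustrative.

```python
def perspective_matrix(f):
    """4x4 perspective matrix with last row (0, 0, -1/f, 1) (assumed form)."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, -1.0 / f, 1.0]]

def translation_matrix(px, py, pz):
    return [[1.0, 0.0, 0.0, px],
            [0.0, 1.0, 0.0, py],
            [0.0, 0.0, 1.0, pz],
            [0.0, 0.0, 0.0, 1.0]]

def matmul4(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, point):
    """Apply a 4x4 homogeneous matrix to a 3-D point and return Cartesian coordinates."""
    h = [point[0], point[1], point[2], 1.0]
    out = [sum(m[i][k] * h[k] for k in range(4)) for i in range(4)]
    return [c / out[3] for c in out[:3]]

# Translate by 10 units along z, then apply the perspective transformation.
# The matrix applied first is written on the right.
combined = matmul4(perspective_matrix(5.0), translation_matrix(0.0, 0.0, 10.0))
image_point = transform(combined, (3.0, 6.0, 10.0))
```

The first two components of `image_point` are approximately (-1, -2), the same values given by evaluating f x_o / (f - z_o) and f y_o / (f - z_o) at the translated point (3, 6, 20) with f = 5.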


2.5 Geometric Moments 


Moments have been utilized in a wide variety of applications ranging from
aircraft identification to character identification. They are relatively simple to compute
and can be made invariant to rotation, scale, and translation. They are one of a general
class of shape descriptors. In the presentation that follows the two-dimensional moments
are analyzed.

Given a piecewise continuous irradiance function, denoted by f(x,y), the (p+q)th
order moments are

$$m_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^p y^q f(x, y)\, dx\, dy, \qquad p, q = 0, 1, 2, \ldots \tag{2.7}$$

It should be noted that the moment sequence $m_{pq}$ is uniquely determined by
the irradiance function f(x,y) given that f(x,y) has nonzero values in a finite portion
of the plane. Conversely, the function f(x,y) is uniquely determined by its
moment sequence $m_{pq}$. To have utility in pattern recognition, moments should be
invariant to parallel translations, rotations in the plane normal to the optical axis,
and scaling.
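
For a digitized image the double integral becomes a double sum over pixels. The sketch below assumes that x is the column index and y is the row index of a small binary image stored as nested lists; these conventions and the name `geometric_moment` are choices made for this example.

```python
def geometric_moment(image, p, q):
    """Discrete m_pq: sum of x^p * y^q * f(x, y) over all pixels.

    Assumption: x is the column index and y is the row index.
    """
    return sum((x ** p) * (y ** q) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))

# A 3-by-2 block of ones inside a 5-by-4 binary image.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

m00 = geometric_moment(image, 0, 0)          # area in pixels
x_bar = geometric_moment(image, 1, 0) / m00  # centroid column
y_bar = geometric_moment(image, 0, 1) / m00  # centroid row
print(m00, x_bar, y_bar)  # 6 2.0 1.5
```

The zeroth order moment is the area of the region and the first order moments locate its centroid, which is the point used to locate each fiducial mark.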
2.6 Central Moments

It is possible to make the geometric moments invariant to parallel
translation. This property is obtained by transforming the geometric moments into central
moments. These central moments are defined as

$$\mu_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \bar{x})^p (y - \bar{y})^q f(x, y)\, d(x - \bar{x})\, d(y - \bar{y}) \tag{2.8}$$

where

$$\bar{x} = m_{10}/m_{00}, \qquad \bar{y} = m_{01}/m_{00} \tag{2.9}$$

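The translation invariance of the central moments is easily shown. Consider the
mapping

$$x' = x + h \tag{2.10}$$

$$y' = y + k \tag{2.11}$$

which transforms the nonzero region A of f(x,y) into A'. The central moments of
A' are

$$\mu'_{pq} = \int\int (x' - \bar{x}')^p (y' - \bar{y}')^q f(x, y)\, d(x' - \bar{x}')\, d(y' - \bar{y}') \tag{2.12}$$

where

$$\bar{x}' = m'_{10}/m'_{00}, \qquad \bar{y}' = m'_{01}/m'_{00} \tag{2.13}$$

Since

$$m'_{01} = m_{01} + k\, m_{00} \tag{2.14}$$

$$m'_{10} = m_{10} + h\, m_{00} \tag{2.15}$$

and

$$m'_{00} = m_{00} \tag{2.16}$$

it follows that $\bar{x}' = \bar{x} + h$ and $\bar{y}' = \bar{y} + k$, so that $x' - \bar{x}' = x - \bar{x}$ and
$y' - \bar{y}' = y - \bar{y}$. Substitution into 2.12 yields 2.8.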

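The central moments can be represented in terms of the ordinary moments.

$$\mu_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \bar{x})^p (y - \bar{y})^q f(x, y)\, dx\, dy \tag{2.17}$$

$$\mu_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left[\sum_{i=0}^{p} \binom{p}{i} x^i (-\bar{x})^{p-i}\right] \left[\sum_{j=0}^{q} \binom{q}{j} y^j (-\bar{y})^{q-j}\right] f(x, y)\, dx\, dy \tag{2.18}$$

Combining summations

$$\mu_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left[\sum_{i=0}^{p}\sum_{j=0}^{q} \binom{p}{i}\binom{q}{j} (-\bar{x})^{p-i} (-\bar{y})^{q-j} x^i y^j\right] f(x, y)\, dx\, dy \tag{2.19}$$

Interchanging summations and integrals

$$\mu_{pq} = \sum_{i=0}^{p}\sum_{j=0}^{q} \binom{p}{i}\binom{q}{j} (-\bar{x})^{p-i} (-\bar{y})^{q-j} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^i y^j f(x, y)\, dx\, dy \tag{2.20}$$

From equation 2.7 and equation 2.20 it is clear that

$$\mu_{pq} = \sum_{i=0}^{p}\sum_{j=0}^{q} \binom{p}{i}\binom{q}{j} (-\bar{x})^{p-i} (-\bar{y})^{q-j} m_{ij} \tag{2.21}$$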
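
The binomial expansion of the central moments and their translation invariance can both be checked numerically on a small binary image. The sketch below assumes a discrete image with x as the column index and y as the row index; the helper names and the test image are illustrative only.

```python
from math import comb, isclose

def raw_moment(image, p, q):
    """Discrete geometric moment m_pq (x = column index, y = row index)."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))

def centroid(image):
    m00 = raw_moment(image, 0, 0)
    return raw_moment(image, 1, 0) / m00, raw_moment(image, 0, 1) / m00

def central_moment_direct(image, p, q):
    """mu_pq computed directly from its definition about the centroid."""
    xb, yb = centroid(image)
    return sum(((x - xb) ** p) * ((y - yb) ** q) * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))

def central_moment_from_raw(image, p, q):
    """mu_pq computed from the raw moments m_ij via the binomial expansion."""
    xb, yb = centroid(image)
    return sum(comb(p, i) * comb(q, j)
               * (-xb) ** (p - i) * (-yb) ** (q - j)
               * raw_moment(image, i, j)
               for i in range(p + 1)
               for j in range(q + 1))

def translate(image, h, k):
    """Shift the pattern h columns right and k rows down by padding with zeros."""
    width = len(image[0])
    blank_rows = [[0] * (width + h) for _ in range(k)]
    return blank_rows + [[0] * h + list(row) for row in image]

# An L-shaped test pattern.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
shifted = translate(image, 2, 1)

same_by_expansion = isclose(central_moment_direct(image, 2, 1),
                            central_moment_from_raw(image, 2, 1),
                            rel_tol=1e-9, abs_tol=1e-9)
same_after_shift = isclose(central_moment_direct(image, 2, 1),
                           central_moment_direct(shifted, 2, 1),
                           rel_tol=1e-9, abs_tol=1e-9)
```

Both flags evaluate to true for this pattern: the expansion agrees with the direct definition, and shifting the pattern leaves the central moment unchanged up to floating point rounding.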


2.7 Algebraic Invariants 

Algebraic invariants first surfaced in the works of Lagrange and were
rediscovered in the works of Gauss, but neither of these men decided to develop their
observations into a formal theory. It was not until Boole, Cayley, and Sylvester that
the study of the theory of algebraic invariants flourished.

Hu is usually credited with the application of algebraic invariant theory to the
formulation of rotation and scale invariant functions of moments. The derivation
that follows is credited to him [1].

Given a binary algebraic form of order p in u and v expressed as

$$f = \sum_{r=0}^{p} \binom{p}{r} a_{p-r,r}\, u^{p-r} v^{r} \tag{2.22}$$

or using the Cayley notation as

$$f = (a_{p0};\ a_{p-1,1};\ \ldots;\ a_{0p})(u, v)^p \tag{2.23}$$
A homogeneous polynomial of the "a" coefficients is an algebraic invariant of
weight w if

$$I(a'_{p0}, \ldots, a'_{0p}) = \Delta^{w} I(a_{p0}, \ldots, a_{0p}) \tag{2.24}$$

where $a'_{p0}, \ldots, a'_{0p}$ are the coefficients obtained from substituting the following
general affine transformation into the original algebraic binary form:

$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} \begin{pmatrix} u' \\ v' \end{pmatrix} \tag{2.25}$$

and $\Delta$ is the determinant of the linear transformation

$$\Delta = \alpha\delta - \beta\gamma \neq 0 \tag{2.26}$$
If the weight of the invariant is zero, it is an absolute invariant; otherwise it is
a relative invariant. There exist certain affine transformations that allow $\Delta$ to be
something other than the determinant of the transformation. These transformations
are useful in deriving the necessary moment invariants. It is also useful to introduce
another pair of variables, x and y, and subject them to the transformation

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{2.27}$$

Transformation 2.25 is referred to as the contragredient transformation, and
transformation 2.27 is referred to as the cogredient transformation. The eight variables
x, y, u, v, x', y', u', and v' share the invariant relationship

$$ux + vy = u'x' + v'y' \tag{2.28}$$

To apply the theory of algebraic invariants to moments, it is necessary to define
an algebraic binary form which has as its coefficients the moments of order p. One
such function is the moment generating function, which is defined as

$$M(u, v) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \sum_{p=0}^{\infty} \frac{1}{p!} (ux + vy)^p f(x, y)\, dx\, dy \tag{2.29}$$
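Interchanging integration and summation produces

$$M(u, v) = \sum_{p=0}^{\infty} \frac{1}{p!} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left[\sum_{r=0}^{p} \binom{p}{r} x^{p-r} y^{r} u^{p-r} v^{r}\right] f(x, y)\, dx\, dy \tag{2.30}$$

Equation 2.30 is equivalent to

$$M(u, v) = \sum_{p=0}^{\infty} \frac{1}{p!} (\mu_{p0}, \ldots, \mu_{0p})(u, v)^p \tag{2.31}$$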

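By combining equations 2.25, 2.27, and 2.29 the following is obtained:

$$M'(u', v') = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \sum_{p=0}^{\infty} \frac{1}{p!} (u'x' + v'y')^p f'(x', y') \frac{1}{|J|}\, dx'\, dy' \tag{2.32}$$

where J is the Jacobian of transformation 2.27, $f'(x', y')$ is equal to f(x,y), and
$M'(u', v')$ is the moment generating function after the transformation. Since

$$\mu'_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x')^p (y')^q f'(x', y')\, dx'\, dy' \tag{2.33}$$

it follows that

$$M'(u', v') = \frac{1}{|J|} \sum_{p=0}^{\infty} \frac{1}{p!} (\mu'_{p0}, \ldots, \mu'_{0p})(u', v')^p \tag{2.34}$$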

Combining the results of 2.29, 2.30, 2.31, 2.32, and 2.34, it is shown that if the
binary algebraic form of order p has an algebraic invariant, then the pth order
moments have the same invariant but multiplied by the absolute value of the Jacobian
of the cogredient transformation. In other words, if

$$I(a'_{p0}, \ldots, a'_{0p}) = \Delta^{w} I(a_{p0}, \ldots, a_{0p}) \tag{2.35}$$

then

$$I(\mu'_{p0}, \ldots, \mu'_{0p}) = |J|\, \Delta^{w} I(\mu_{p0}, \ldots, \mu_{0p}) \tag{2.36}$$

Under the scale change denoted by

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \alpha & 0 \\ 0 & \alpha \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{2.37}$$

each coefficient of the binary algebraic form is an invariant:

$$a'_{pq} = \alpha^{p+q} a_{pq} \tag{2.38}$$

Therefore, the relative moment invariants are multiplied by the Jacobian of 2.37,
which is $\alpha^2$, producing

$$\mu'_{pq} = \alpha^{p+q+2} \mu_{pq} \tag{2.39}$$

To obtain an absolute scale invariant, the value of $\alpha$ is obtained from the relationship
between the zeroth order moments:

$$\alpha = \left(\frac{\mu'_{00}}{\mu_{00}}\right)^{1/2} \tag{2.40}$$

Substituting 2.40 into 2.39 obtains:

$$\frac{\mu'_{pq}}{(\mu'_{00})^{\frac{p+q}{2}+1}} = \frac{\mu_{pq}}{(\mu_{00})^{\frac{p+q}{2}+1}} \tag{2.41}$$
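
The scale normalization, in which each central moment is divided by the zeroth order moment raised to the power (p+q)/2 + 1, can be checked exactly on a shape whose continuous moments have a closed form. The sketch below uses a uniform a-by-b rectangle centered at its centroid; the rectangle and the function names are assumptions chosen for this example.

```python
from math import isclose

def rect_central_moment(a, b, p, q):
    """Central moment mu_pq of a uniform a-by-b rectangle (f = 1 inside, 0 outside)."""
    def axis_integral(length, n):
        # Integral of t^n over [-length/2, length/2]; zero for odd n by symmetry.
        if n % 2 == 1:
            return 0.0
        return 2.0 * (length / 2.0) ** (n + 1) / (n + 1)
    return axis_integral(a, p) * axis_integral(b, q)

def normalized_moment(a, b, p, q):
    """mu_pq divided by mu_00 raised to the power (p + q) / 2 + 1."""
    mu00 = rect_central_moment(a, b, 0, 0)
    return rect_central_moment(a, b, p, q) / mu00 ** ((p + q) / 2.0 + 1.0)

small = normalized_moment(2.0, 1.0, 2, 0)  # 2-by-1 rectangle
large = normalized_moment(6.0, 3.0, 2, 0)  # the same rectangle scaled by 3
unchanged = isclose(small, large, rel_tol=1e-12)
```

For this rectangle the normalized second order moment equals a/(12b), which depends only on the aspect ratio, so `small` and `large` agree. For digitized images the agreement is only approximate, because the sampled moments depend on the resolution of the digitizing grid.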

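What follows is a derivation of rotational invariance. For a rotational transformation
the cogredient transformation is

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{2.42}$$

and the contragredient transformation is

$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} u' \\ v' \end{pmatrix} \tag{2.43}$$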
Since the Jacobian of the cogredient transformation is equal to one, the algebraic
invariant is equivalent to the moment invariant. Therefore, treating the moments
as the coefficients of the binary form

$$(\mu_{p0}, \ldots, \mu_{0p})(u, v)^p \tag{2.44}$$

and using the following transformations

$$U = u + iv, \qquad V = u - iv \tag{2.45}$$

$$U' = u' + iv', \qquad V' = u' - iv' \tag{2.46}$$

the following relations are obtained:

$$U' = U e^{-i\theta}, \qquad V' = V e^{i\theta} \tag{2.47}$$

Substituting 2.47, 2.46, and 2.45 into 2.44,

$$(I_{p0}, \ldots, I_{0p})(U, V)^p = (\mu_{p0}, \ldots, \mu_{0p})(u, v)^p \tag{2.48}$$

$$(I'_{p0}, \ldots, I'_{0p})(U', V')^p = (\mu'_{p0}, \ldots, \mu'_{0p})(u', v')^p \tag{2.49}$$

$$(I_{p0}, \ldots, I_{0p})(U, V)^p = (I'_{p0}, \ldots, I'_{0p})(U e^{-i\theta}, V e^{i\theta})^p \tag{2.50}$$

Equating like terms in 2.50 obtains

$$I'_{p0} = e^{ip\theta} I_{p0} \tag{2.51}$$

$$I'_{p-1,1} = e^{i(p-2)\theta} I_{p-1,1};\ \ldots \tag{2.52}$$

$$I'_{1,p-1} = e^{-i(p-2)\theta} I_{1,p-1} \tag{2.53}$$

$$I'_{0p} = e^{-ip\theta} I_{0p} \tag{2.54}$$

From the identity of the first two expressions, it is clear that $I_{p-r,r}$ is the complex
conjugate of $I_{r,p-r}$ and

$$I_{p-r,r} = \left[(\mu_{p0};\ \mu_{p-2,2};\ \ldots;\ \mu_{p-2r,2r})(1, 1)^r;\ (\mu_{p-1,1};\ \mu_{p-3,3};\ \ldots;\ \mu_{p-2r-1,2r+1})(1, 1)^r;\ \ldots;\ (\mu_{2r,p-2r};\ \mu_{2r-2,p-2r+2};\ \ldots;\ \mu_{0p})(1, 1)^r\right](1, -i)^{p-2r}$$

where p - 2r > 0, and

$$I_{p/2,p/2} = \mu_{p0} + \binom{p/2}{1}\mu_{p-2,2} + \binom{p/2}{2}\mu_{p-4,4} + \cdots + \mu_{0p} \tag{2.55}$$

where p is even.
For rotation and reflection the cogredient transformation is

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{2.56}$$

and the corresponding relations are

$$I'_{p0} = e^{-ip\theta} I_{0p} \tag{2.57}$$

$$I'_{p-1,1} = e^{-i(p-2)\theta} I_{1,p-1};\ \ldots \tag{2.58}$$

$$I'_{1,p-1} = e^{i(p-2)\theta} I_{p-1,1} \tag{2.59}$$

$$I'_{0p} = e^{ip\theta} I_{p0} \tag{2.60}$$

From the above derivation Hu obtains six rotation invariants and one skew invariant.
The six rotation invariants are as follows:

$$\mu_{20} + \mu_{02}$$

$$(\mu_{20} - \mu_{02})^2 + 4\mu_{11}^2$$

$$(\mu_{30} - 3\mu_{12})^2 + (3\mu_{21} - \mu_{03})^2$$

$$(\mu_{30} + \mu_{12})^2 + (\mu_{21} + \mu_{03})^2$$

$$(\mu_{30} - 3\mu_{12})(\mu_{30} + \mu_{12})[(\mu_{30} + \mu_{12})^2 - 3(\mu_{21} + \mu_{03})^2] + (3\mu_{21} - \mu_{03})(\mu_{21} + \mu_{03})[3(\mu_{30} + \mu_{12})^2 - (\mu_{21} + \mu_{03})^2]$$

$$(\mu_{20} - \mu_{02})[(\mu_{30} + \mu_{12})^2 - (\mu_{21} + \mu_{03})^2] + 4\mu_{11}(\mu_{30} + \mu_{12})(\mu_{21} + \mu_{03})$$

The skew invariant is

$$(3\mu_{21} - \mu_{03})(\mu_{30} + \mu_{12})[(\mu_{30} + \mu_{12})^2 - 3(\mu_{21} + \mu_{03})^2] - (\mu_{30} - 3\mu_{12})(\mu_{21} + \mu_{03})[3(\mu_{30} + \mu_{12})^2 - (\mu_{21} + \mu_{03})^2]$$
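
The behavior of these seven quantities can be illustrated on a small binary pattern. The sketch below computes them from discrete central moments (x as column index, y as row index) and compares the original pattern with a copy rotated by 90 degrees and with a mirrored copy; a quarter turn is used because it maps the pixel grid onto itself exactly. The function names and the test pattern are assumptions made for this example.

```python
from math import isclose

def central_moment(image, p, q):
    """Discrete central moment mu_pq (x = column index, y = row index)."""
    m00 = sum(v for row in image for v in row)
    xb = sum(x * v for row in image for x, v in enumerate(row)) / m00
    yb = sum(y * v for y, row in enumerate(image) for v in row) / m00
    return sum(((x - xb) ** p) * ((y - yb) ** q) * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))

def hu_invariants(image):
    """Return the six rotation invariants followed by the skew invariant."""
    mu = lambda p, q: central_moment(image, p, q)
    m20, m02, m11 = mu(2, 0), mu(0, 2), mu(1, 1)
    m30, m03, m21, m12 = mu(3, 0), mu(0, 3), mu(2, 1), mu(1, 2)
    a, b = m30 + m12, m21 + m03          # recurring sums
    c, d = m30 - 3 * m12, 3 * m21 - m03  # recurring differences
    return [
        m20 + m02,
        (m20 - m02) ** 2 + 4 * m11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (m20 - m02) * (a ** 2 - b ** 2) + 4 * m11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ]

def rotate90(image):
    """Quarter-turn rotation of the pixel grid."""
    return [list(row) for row in zip(*image[::-1])]

def mirror(image):
    """Left-right reflection of the pixel grid."""
    return [row[::-1] for row in image]

image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
original = hu_invariants(image)
rotated = hu_invariants(rotate90(image))
reflected = hu_invariants(mirror(image))
```

All seven values agree between `original` and `rotated`. Under reflection the first six agree while the seventh changes sign, which is why it is called a skew invariant and can be used to tell a pattern from its mirror image.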
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Figure 2.4: Square and corresponding 0 - s curve 
2.8 $\psi$-$s$ curves

The utilization of $\psi$-$s$ curves is one way to characterize the shape of an image
using its boundary. It is essentially a chain coded representation of the boundary. $\psi$
is the angle formed between a reference line and the tangent to the curve, and $s$
is the arc length as the boundary is traversed. It can be shown that straight lines in
an image correspond to horizontal lines in the $\psi$-$s$ curve and circles correspond to
straight lines with slopes of $1/r$, where $r$ is the radius of the circle. The $\psi$-$s$ curve
for a closed boundary is periodic with a discontinuous jump from $2\pi$ to 0 as the
curve is retraced to the starting point.
Figures 2.4, 2.5, and 2.6 show the $\psi$-$s$ curves for a square, circle, and equilateral
triangle, respectively.
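
The behaviour described above can be illustrated with a short sketch. It assumes the boundary is supplied as an ordered list of vertices of a closed polygon traversed counterclockwise, and it takes the positive x-axis as the reference line; the function name `psi_s_curve` is hypothetical.

```python
import math


def psi_s_curve(points):
    """Return (s, psi) samples for a closed polygonal boundary.

    For each edge, psi is the tangent angle in [0, 2*pi) measured from the
    positive x-axis and s is the arc length at which that edge starts.
    """
    n = len(points)
    s_values, psi_values = [], []
    s = 0.0
    for k in range(n):
        x0, y0 = points[k]
        x1, y1 = points[(k + 1) % n]      # wrap around to close the boundary
        dx, dy = x1 - x0, y1 - y0
        s_values.append(s)
        psi_values.append(math.atan2(dy, dx) % (2 * math.pi))
        s += math.hypot(dx, dy)
    return s_values, psi_values


# A unit square: psi stays constant along each side and jumps by pi/2 at corners.
square_s, square_psi = psi_s_curve([(0, 0), (1, 0), (1, 1), (0, 1)])
```

For a finely sampled circle of radius r, the ratio of successive differences of psi and s returned by this function approaches 1/r, which is the constant slope mentioned above.
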



CHAPTER 3 

PROBLEM STATEMENT 

The objective is to determine the position of each link of a PUMA arm
using fiducial marks. Each link should be uniquely identifiable regardless of the
orientation of the arm, provided that the link is in the field of view of the camera.
The placement of multiple fiducial marks on the arm provides an excellent method
to accomplish this task. Labeling each link of the PUMA with multiple fiducial marks
is both efficient and effective. The fiducial mark itself
is a planar object of specified dimensions. Because of the dimensional specification,
the location of each affixed mark is known relative to the arm-centered coordinate
system. This reduces the original objective to one of distinguishing fiducial marks
and locating an associated point.

When the marks are viewed by the CCD camera all the points on the fiducial
marks are subjected to a perspective projective transformation that maps the three
dimensional coordinates of the mark into two dimensional points in the image plane.
This mapping is a noninvertible transformation. Therefore, any point in the image
plane can correspond to an infinite number of points in the arm-centered coordinate
system that lie upon the line connecting the image point to the focus of the camera.
However, it is possible to locate the arm-centered coordinates of the mark by utilizing
two calibrated cameras. If an algorithm is employed to locate a particular point
in both image planes, the two locations can be used in a triangulation algorithm
to identify the location of the point in the arm-centered coordinate system. The
triangulation algorithm is straightforward. Therefore, it will not be addressed any
further.
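
For orientation only, one common way to triangulate is sketched here. It is not taken from this work: it assumes each calibrated camera has already been reduced to a ray (a camera center and a direction through the image point, both expressed in the arm-centered system) and returns the midpoint of the shortest segment joining the two rays. The name `triangulate_midpoint` is hypothetical.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of closest approach of the rays c1 + t*d1 and c2 + s*d2.

    c1, c2 are camera centers and d1, d2 are ray directions, all 3-tuples.
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")

    # Parameters minimizing the distance between the two rays.
    t = (b * e - c * d) / denom
    s = (a * e - b * d) / denom
    p1 = tuple(ci + t * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + s * di for ci, di in zip(c2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))


# Two cameras one unit apart, both looking at the point (1, 2, 5).
point = triangulate_midpoint((0, 0, 0), (1, 2, 5), (1, 0, 0), (0, 2, 5))
```

With noise-free rays the two closest points coincide and the midpoint is the original point; with real image measurements the rays generally do not intersect, which is why a midpoint (or a least squares variant) is used.
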

If the arm-centered coordinate system is aligned with the image coordinate
system, the perspective transformation takes the form given in Chapter Two.
For the two coordinate systems to be aligned, it is necessary to align the x-axis
and y-axis of the arm-centered system with the u-axis and v-axis of the image,
respectively. Since this is generally not the case, it is necessary to pre-multiply
the arm-centered coordinates by a composite rotation and translation matrix that
transforms the original coordinates to coordinates relative to a three-dimensional
coordinate system that has its origin aligned with the image plane origin, its x-axis
aligned with the u-axis, its y-axis aligned with the v-axis, and its z-axis aligned
with the optical axis. Once the linear transformation is achieved the perspective
transformation given in Chapter Two is valid. Since the cameras are calibrated
and the transformation is known, obtaining the coordinates of the point in the
arm-centered coordinate system is accomplished by post-multiplying the coordinates
obtained via triangulation with the inverse of the transformation matrix. Therefore,
it is evident that the recognition of the system of marks and their associated points
determines the position and orientation of the arm. The main emphasis of this work
is the design and recognition of the fiducial marks.

The imaging of an object using a CCD camera produces a substantial amount
of distortion. Perspective transformation, quantization and sampling produce the
most distortion, but the pincushion and barrel effects also contribute to the degradation
of the object representation. As a consequence, it is necessary to design marks
and feature extraction algorithms that are insensitive to these effects. Perspective
distortion of a mark occurs when there are points in the mark that have different
optical axis coordinate values. It is essentially the converging railroad effect. It can
transform squares into trapezoids and circles into distorted ellipses. Because it is
proportional to the inverse of the distance along the optical axis, it is difficult to
account for without an approximate knowledge of the position and orientation of
the mark. When all the points within a mark lie in a plane perpendicular to the
optical axis only a scale change results. Therefore, if successive images are taken
of a mark that is only translated in the direction parallel to the optical axis, the
images only differ by a scale factor.

Representation of a continuous object by a finite number of pixels inherently
produces an inaccurate representation. As the ratio of the image size to pixel size
decreases, the image distortion increases. If the ratio of image size to pixel size
becomes too small, the image becomes unrecognizable. This effect is similar to the
aliasing effect for one-dimensional periodic signals.

For the particular robotic system employed in this work, the mark will be no
greater than two meters away from the camera. The resolution of the frame grabber
is 512x480 pixels and the field of view is approximately two square meters. This
produces a pixel resolution of approximately two millimeters when the object is
two meters away from the image plane. This implies that the marks should be as
large as possible to compensate for the large pixel size at that distance, but there
is a limitation on the size of the marks. This limitation is caused by the link size.
Each link has six sides and at least four of these sides can be used to affix a mark.
The smallest side of a link is approximately 3.5 inches. Therefore, this is the upper
bound of the size of the fiducial mark.
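
To make the constraint concrete, the arithmetic below estimates how many pixels span the largest admissible mark at the worst-case distance. It is a rough sketch that takes the two millimeter pixel resolution quoted above as given; the function name `pixels_spanned` is illustrative.

```python
MM_PER_INCH = 25.4


def pixels_spanned(size_inches, mm_per_pixel):
    """Number of pixels covered by a length given in inches."""
    return size_inches * MM_PER_INCH / mm_per_pixel


# A 3.5 inch mark viewed at roughly 2 mm per pixel.
mark_width_px = pixels_spanned(3.5, 2.0)
```

Under these assumptions the outer shape of a mark covers on the order of forty-four pixels, and the nested inner shapes cover correspondingly fewer, which is why the ratio of shape size to pixel size is a central concern in the design of the marks.
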



CHAPTER 4 

FIDUCIAL MARK IDENTIFICATION 

The problem consists of generating a sufficient number of fiducial marks to label the
robot arm regardless of its position and orientation in space and the background it
is placed upon.

4.1 Design of the Fiducial Marks 

It was decided that marks based upon simple geometric figures might aid in
the identification process. In any recognition process involving several patterns it is
necessary to extract a set of features that, when utilized in a decision making function,
will yield a unique value for each of the patterns. If this criterion is not satisfied
then two or more of the patterns cannot be distinguished. For this reason the
circle, square, and equilateral triangle seemed like excellent candidates for fiducial
marks. They possess features that are readily extracted and uniquely determined. Some of
the features that can be extracted are moment invariants, contour signatures, and
compactness measures.

Another criterion that needs to be addressed is the size restriction of the marks.
The marks are restricted to a 3.5 inch box. This restriction exists because
the marks have to be placed on the arm. Each mark must fit on each face of every
link.

Another criterion that needs to be addressed is the quantity of unique marks
used to label the links of the arm. The three basic shapes must generate at least
twenty-seven unique composite shapes to accomplish the labeling task. This can be
accomplished by nesting shapes within other shapes. The sizes of the shapes are
chosen to maximize the size of the inner shape while ensuring that there is at least
a three pixel wide border separating the shapes. The inner shape size is maximized
to minimize the distortion due to digitizing. As the ratio of the shape size to pixel
size is decreased the shapes become increasingly more difficult to distinguish. All
the shapes designed consist of a black shape within a white shape within a black
shape within a white rectangular border. This configuration generates twenty-seven
unique composite figures.

4.2 Segmentation of the Marks from the Background 

For the algorithm to succeed, it is necessary to segment the mark from its
surroundings. The placement of the shapes within a white rectangular region enables
the composite patterns to remain intact. If the outer white rectangular region is
not present and the marks are placed upon a black background, the outer figure
might be unrecoverable in the image. It is possible for the outer white rectangular
region to be distorted by its background, but this is of no consequence because the
algorithm only searches for a white border and does not try to classify the shape.
This algorithm is extremely efficient for backgrounds with a relatively small number
of white regions. After a white region is located, the algorithm searches the inner
region to determine the presence of a mark.

4.3 Extracting the Outer, Inner, and Middle Borders 

Extraction of each of the borders is vitally important in obtaining a reliable 
feature space. If an error is produced in the border extraction process, the subse- 
quent feature space calculations will yield inaccurate results. In general, the method 
chosen to extract the border depends upon the border definition. For a continuous 
image a boundary point is usually defined as follows 

Definition 1 A boundary point of a set is a point having the property that every
neighborhood of it contains points in the set and points not in the set.

This definition usually refers to the set of points within some connected region and
the set of points outside the region, where each region consists of an infinite number
of points, and the neighborhood of each point is infinite. For a discrete image every
region contains a finite set of points or pixels and each neighborhood consists of at
most eight pixels. Because of this distinction, it is necessary to modify the above
definition. Before a definition of the border pixel for a discrete image is given, it is
necessary to define two related terms. These terms are the four-neighbors and the
eight-neighbors of a pixel.

Definition 2 Given a pixel P at coordinates (x,y), the four-neighbors of the pixel
are given by the pixels with the coordinates (x-1,y), (x,y+1), (x+1,y), and (x,y-1).

Definition 3 Given a pixel P at coordinates (x,y), the eight-neighbors of the pixel
are given by the four-neighbors of the pixel and the pixels with the additional
coordinates (x-1,y+1), (x+1,y+1), (x+1,y-1), and (x-1,y-1).

Now, the definition of a border pixel can proceed.

Definition 4 A pixel P at coordinates (x,y) is a border pixel if and only if P has at
least two eight-neighbors in the same set as P, and P does not have more than three
four-neighbors in the same set as P, where the set of P contains all of the pixels
that have intensities that are allowed to be connected.

For binary images the images are divided into two sets, pixels with a value of one
and pixels with a value of zero.
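
Definitions 2 through 4 translate directly into a membership test. The sketch below assumes a binary image stored as a list of rows indexed as `image[y][x]`, and it treats positions outside the image as not belonging to the set of P; both choices, as well as the function names, are assumptions made for illustration.

```python
def four_neighbors(x, y):
    """Coordinates of the four-neighbors of (x, y), as in Definition 2."""
    return [(x - 1, y), (x, y + 1), (x + 1, y), (x, y - 1)]


def eight_neighbors(x, y):
    """Four-neighbors plus the diagonal neighbors, as in Definition 3."""
    return four_neighbors(x, y) + [
        (x - 1, y + 1), (x + 1, y + 1), (x + 1, y - 1), (x - 1, y - 1)
    ]


def is_border_pixel(image, x, y):
    """Definition 4: at least two eight-neighbors and at most three
    four-neighbors share the value of the pixel at (x, y)."""
    height, width = len(image), len(image[0])
    value = image[y][x]

    def in_same_set(px, py):
        # Pixels outside the image are treated as not in the set (assumption).
        return 0 <= px < width and 0 <= py < height and image[py][px] == value

    n8 = sum(in_same_set(px, py) for px, py in eight_neighbors(x, y))
    n4 = sum(in_same_set(px, py) for px, py in four_neighbors(x, y))
    return n8 >= 2 and n4 <= 3


# A 3x3 block of ones on a background of zeros.
block = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
```

In this example the eight outer pixels of the block satisfy the definition, while the center pixel does not because all four of its four-neighbors lie in the same set.
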

To extract the boundary of each shape, the algorithm searches for a boundary
candidate. Once a candidate is found it searches the eight-neighbors of this pixel in
a clockwise fashion for another border candidate. If another candidate is found it
searches the eight-neighbors of this pixel. It continues this search until it reaches the
first border element. If at any point it cannot find another border pixel, it returns
to the previous pixel and searches the remaining eight-neighbors. If the algorithm
backtracks to the starting point and cannot find any border elements among the
remaining eight-neighbors it returns and moves on to the other starting points. If
no borders are found in the admissible region a failed flag is returned.

Once the border from an enclosing shape is obtained, the algorithm restricts its
search region to the region enclosed by the border and searches for border elements
that belong to the opposite intensity set of the enclosing border.

4.4 Moments as Feature Parameters 

Moments and functions of moments have been utilized as pattern features in
a number of applications involving the recognition of planar objects. Functions of
moments can be utilized to obtain features which are invariant to scale, rotation
and translation. They are considered to be reliable features if they are insensitive
to image degrading effects such as quantization and sampling. Moments are global
descriptors which characterize the distribution of the points of an image. One of the
major drawbacks of using moments is the large number of multiplications involved
in the computational process. The straightforward method of calculating moments
requires 10MN multiplications for an M x N image. Unless the system has a dedicated
math coprocessor, the extraction of these features in real time is infeasible. However,
there have been several fast algorithms devised for this problem.

4.4.1 Moments of a Generalized Rectangle 

Consider the generalized rectangle represented by Figure 4.1 where $a_1$ is the
width, $T(a_1)$ is the length, and $T$ is the arbitrary scale factor. Consider the region
in the u-v plane represented by Figure 4.2.

The central moments of the region are denoted as follows:

$$ \mu_{pq} = \iint x^p y^q \, dx \, dy \tag{4.1} $$

$$ \mu_{pq} = \int_{-a_1}^{a_1} \int_{-Ta_1}^{Ta_1} x^p y^q \, dx \, dy \tag{4.2} $$

$$ \mu_{pq} = \left[ \frac{x^{p+1}}{p+1} \right]_{-Ta_1}^{Ta_1} \left[ \frac{y^{q+1}}{q+1} \right]_{-a_1}^{a_1} \tag{4.3} $$

$$ \mu_{pq} = \left[ \frac{(Ta_1)^{p+1}}{p+1} - \frac{(-Ta_1)^{p+1}}{p+1} \right] \left[ \frac{a_1^{q+1}}{q+1} - \frac{(-a_1)^{q+1}}{q+1} \right] \tag{4.4} $$

$$ \mu_{pq} = \frac{1}{(p+1)(q+1)} \left[ (Ta_1)^{p+1} (1 - (-1)^{p+1}) \right] \left[ a_1^{q+1} (1 - (-1)^{q+1}) \right] \tag{4.5} $$

$$ \mu_{pq} = \frac{1}{(p+1)(q+1)} \left[ T^{p+1} a_1^{p+q+2} \right] \left[ (1 - (-1)^{p+1})(1 - (-1)^{q+1}) \right] \tag{4.6} $$

It can be readily seen that if either p or q is odd then $\mu_{pq} = 0$. If p and q are
even then $\mu_{pq}$ is given by

$$ \mu_{pq} = \frac{4 T^{p+1} a_1^{p+q+2}}{(p+1)(q+1)} \tag{4.7} $$

To calculate $\eta_{pq}$, the scale normalized central moments, the following is used

$$ \eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{(p+q)/2 + 1}} \tag{4.8} $$

Using the results of equation 4.8

$$ \eta_{pq} = \frac{T^{(p-q)/2}}{4^{(p+q)/2} (p+1)(q+1)} $$

The first rotationally invariant moment is given by

$$ \phi_1 = \eta_{20} + \eta_{02}, \tag{4.12} $$

where $\eta_{20}$ is

$$ \eta_{20} = \frac{T}{12} $$

and $\eta_{02}$ is

$$ \eta_{02} = \frac{1}{12T} \tag{4.13} $$

As a result,

$$ \phi_1 = \frac{T^2 + 1}{12T} \tag{4.14} $$

For the special case of a square, T=1,

$$ \phi_1 = 1/6. $$

For the case where T=2,

$$ \phi_1 = 5/24. $$

The second rotationally invariant moment $\phi_2$ is given by

$$ \phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2 = \left( \frac{T^2 - 1}{12T} \right)^2 $$

It is clear from equations 4.15 and 4.16 that the moment function $\phi_1^2 - \phi_2$ is invariant
for all rectangles.
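
The invariance claim can be checked with exact rational arithmetic. The sketch below evaluates the closed form of equation 4.6 for the rectangle with half-widths $Ta_1$ and $a_1$, and it assumes the usual scale normalization $\eta_{pq} = \mu_{pq} / \mu_{00}^{(p+q)/2+1}$; the function names are illustrative only.

```python
from fractions import Fraction


def mu(p, q, T, a1=1):
    """Central moment of the rectangle |x| <= T*a1, |y| <= a1 (equation 4.6)."""
    T, a1 = Fraction(T), Fraction(a1)
    sign_terms = (1 - (-1) ** (p + 1)) * (1 - (-1) ** (q + 1))
    return T ** (p + 1) * a1 ** (p + q + 2) * sign_terms / ((p + 1) * (q + 1))


def eta(p, q, T, a1=1):
    """Scale normalized central moment; only even orders p + q are handled."""
    order = p + q
    if order % 2:
        raise ValueError("odd orders need a fractional power of mu_00")
    return mu(p, q, T, a1) / mu(0, 0, T, a1) ** (order // 2 + 1)


def phi1_phi2(T, a1=1):
    """First two rotation invariants of the generalized rectangle."""
    e20, e02, e11 = eta(2, 0, T, a1), eta(0, 2, T, a1), eta(1, 1, T, a1)
    return e20 + e02, (e20 - e02) ** 2 + 4 * e11 ** 2


# phi_1**2 - phi_2 for several aspect ratios, including a non-integer one.
values = [phi1_phi2(T)[0] ** 2 - phi1_phi2(T)[1] for T in (1, 2, 3, Fraction(7, 3))]
```

Every entry of `values` equals 1/36 regardless of the aspect ratio T or the size $a_1$, which is the constant that the moment function takes for all rectangles.
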





4.4.2 Moment Calculations for a Generalized Ellipse 

The calculation of the moments of a generalized ellipse is sufficient to obtain
the moments of a circle. It was decided to extract these features. This extraction
may prove to be useful in the analysis of the effects of the distortion of a circle
introduced by orthogonal projection. If the ellipse is represented by Figure 4.3, the
central moments of the region under the change of variables

$$ x = T r \cos\theta $$

and

$$ y = r \sin\theta $$

with $0 \le r \le r_1$ and $0 \le \theta \le 2\pi$ are denoted by

$$ \mu_{pq} = \int_0^{2\pi} \int_0^{r_1} (T r \cos\theta)^p (r \sin\theta)^q \, T r \, dr \, d\theta. $$

Grouping r's and extracting the scale factor produces

$$ \mu_{pq} = T^{p+1} \int_0^{2\pi} \int_0^{r_1} r^{p+q+1} \cos^p\theta \sin^q\theta \, dr \, d\theta $$

The first iterated integration produces

$$ \mu_{pq} = \frac{T^{p+1} r_1^{p+q+2}}{p+q+2} \int_0^{2\pi} \cos^p\theta \sin^q\theta \, d\theta $$

The evaluation of the second iterated integral produces

$$ \mu_{pq} = \frac{2\pi T^{p+1} r_1^{p+q+2}}{p+q+2} \cdot \frac{[(p-1)(p-3) \cdots 3 \cdot 1][(q-1)(q-3) \cdots 3 \cdot 1]}{[(p+q)(p+q-2) \cdots (q+2)][q(q-2) \cdots 4 \cdot 2]} \tag{4.17} $$

for p and q even, and zero otherwise.

Substitution of p=2 and q=0 into equation 4.17 yields

$$ \mu_{20} = \frac{\pi T^3 r_1^4}{4} $$

Substitution of p=0 and q=2 into equation 4.17 yields

$$ \mu_{02} = \frac{\pi T r_1^4}{4} $$

Calculation of the second order scale invariant moments produces

$$ \eta_{20} = \frac{T}{4\pi}, \qquad \eta_{02} = \frac{1}{4\pi T} $$

The first and second rotation, scale and translation invariant moments are

$$ \phi_1 = \frac{T^2 + 1}{4\pi T} \tag{4.18} $$

$$ \phi_2 = \left( \frac{T^2 - 1}{4\pi T} \right)^2 \tag{4.19} $$

It is clear from equations 4.18 and 4.19 that the moment function $\phi_1^2 - \phi_2$ is invariant
for all ellipses.
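
As a numerical cross-check, the invariants of a digitized ellipse can be compared with the continuous values $\phi_1 = 1/(2\pi)$ for a circle and $\phi_1^2 - \phi_2 = 1/(4\pi^2)$ for any ellipse. The sketch below is an illustration only: it assumes a pixel belongs to the shape when its integer center lies inside the ellipse, and the function name is hypothetical.

```python
import math


def digitized_ellipse_invariants(r, T):
    """phi_1 and phi_2 of the pixels inside x^2/(T r)^2 + y^2/r^2 <= 1."""
    a, b = T * r, r
    pixels = [
        (x, y)
        for x in range(-int(a) - 1, int(a) + 2)
        for y in range(-int(b) - 1, int(b) + 2)
        if (x / a) ** 2 + (y / b) ** 2 <= 1.0
    ]
    n = len(pixels)                      # mu_00 for a binary image
    x_bar = sum(x for x, _ in pixels) / n
    y_bar = sum(y for _, y in pixels) / n
    mu20 = sum((x - x_bar) ** 2 for x, _ in pixels)
    mu02 = sum((y - y_bar) ** 2 for _, y in pixels)
    mu11 = sum((x - x_bar) * (y - y_bar) for x, y in pixels)

    # Second order moments are normalized by mu_00 squared.
    eta20, eta02, eta11 = mu20 / n ** 2, mu02 / n ** 2, mu11 / n ** 2
    phi1 = eta20 + eta02
    phi2 = (eta20 - eta02) ** 2 + 4 * eta11 ** 2
    return phi1, phi2


circle_phi1, _ = digitized_ellipse_invariants(40, 1)
ellipse_phi1, ellipse_phi2 = digitized_ellipse_invariants(40, 2)
```

For a radius of forty pixels both quantities should fall close to their continuous counterparts, and the agreement degrades as the radius shrinks.
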


4.4.3 Moment Calculations for an Equilateral Triangle 

Given the equilateral triangle in Figure 4.4 with side length 2a, the central
moments are defined as follows:

Performing the first integration yields

Using the binomial expansion,

Performing the second integration yields

substituting for v,

4.4.4 Moments of orthogonally projected shapes


f 2 fl, ‘([^] p+ 7 + 2 - , -[ F ^r ,+2 - 1 ) 


In many situations, it becomes possible to approximate the perspective projective
transformation of an imaging device by an orthogonal transformation. One
such situation is the case when the variation in object point distances is negligible
with respect to the object plane to image plane distance. In this case the distance
of the object along the optical axis may be considered fixed. For example, since

$$ x_i = \frac{f}{z_c} x_o $$

and $f$ and $z_c$ are fixed, the transformation between object and image points is
equivalent to a scale change,

$$ x_i = k x_o $$

and

$$ y_i = k y_o $$

where $k$ is the constant $f/z_c$. This is the orthographic projection model, and it
facilitates the development of moment functions for the basic shapes.

Consider the situation where one of the basic shapes is arbitrarily rotated
about the object centered coordinate system. This can be represented by the
multiplication of the composite rotation matrix $R$ with each object point $[x_o \; y_o]$,
where $R$ is given by

$$ R = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix} $$

and $r_{ij}$ is a trigonometric function of the rotation angles $\alpha$ and $\phi$ given in Chapter
Two. Under this rotation and orthogonal projection,

$$ x_i = k(r_{11} x_o + r_{12} y_o + T_x) $$

$$ y_i = k(r_{21} x_o + r_{22} y_o + T_y) $$

Given the endpoints of two parallel line segments in the object coordinate
system denoted by, for line 1,

$$ (x_1, y_1), \quad (x_1 + \Delta x, \; y_1 + \Delta y) $$

and for line 2

$$ (x_2, y_2), \quad (x_2 + \Delta x, \; y_2 + \Delta y). $$

Now if both line segments are subjected to the same rotation and orthogonal
projection, the points of line 1 transform into

$$ k(r_{11} x_1 + r_{12} y_1 + T_x, \; r_{21} x_1 + r_{22} y_1 + T_y) \tag{4.20} $$

and

$$ k(r_{11}(x_1 + \Delta x) + r_{12}(y_1 + \Delta y) + T_x, \; r_{21}(x_1 + \Delta x) + r_{22}(y_1 + \Delta y) + T_y), $$

and those of line 2 transform into

$$ k(r_{11} x_2 + r_{12} y_2 + T_x, \; r_{21} x_2 + r_{22} y_2 + T_y) \tag{4.21} $$

and

$$ k(r_{11}(x_2 + \Delta x) + r_{12}(y_2 + \Delta y) + T_x, \; r_{21}(x_2 + \Delta x) + r_{22}(y_2 + \Delta y) + T_y). $$

From equations 4.20 and 4.21 it is clear that the slopes of both transformed line
segments are equivalent and the slope is given by

$$ \frac{r_{22} \Delta y + r_{21} \Delta x}{r_{12} \Delta y + r_{11} \Delta x} $$

The line segment lengths are given by

$$ k \sqrt{(r_{21} \Delta x + r_{22} \Delta y)^2 + (r_{12} \Delta y + r_{11} \Delta x)^2} $$

Since under rotation and orthogonal projection, parallel line segments remain
parallel, it is evident that a square under this transformation must be converted to
a rectangle. As a result, the moments of an orthogonally projected square are given
by those of the generalized rectangle, where

$$ T = \frac{\sqrt{r_{11}^2 + r_{21}^2}}{\sqrt{r_{12}^2 + r_{22}^2}} \tag{4.22} $$

Similarly, rotation and orthogonal projection of a circle produces an ellipse
and T is given by equation 4.22.


4.4.5 Moment approximations due to digitization

For digital images, the double integration used to define the moments is typically
approximated by double summations and the moments are denoted as

$$ m_{pq} = \sum_{x=1}^{M} \sum_{y=1}^{N} x^p y^q f(x,y) \tag{4.23} $$

$$ \mu_{pq} = \sum_{x=1}^{M} \sum_{y=1}^{N} (x - \bar{x})^p (y - \bar{y})^q f(x,y), \tag{4.24} $$

where M and N are the image dimensions.

Figure 4.5: Curvature plot of an equilateral triangle

Inherent in this approximation is the loss of strict rotational and scale invariance.
For a square it is shown by Teh and Chin [11] that the first invariant moment
is

$$ \phi_1 = \frac{1 - 1/a^2}{6} $$

where $a$ is the ratio of the square size to the pixel size. It is readily observed that $\phi_1$
is no longer scale invariant, but depends on the size of the sampling grid. The loss
of rotational and scale invariance arises because the sampling grid is not adequate
to represent the shapes, and as a result changes in orientation result in a changed
representation of the object in the image plane.
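
The dependence on the sampling grid can be reproduced with the double summation of equation 4.24. The sketch below assumes the simplest digitization, an axis-aligned square of a by a unit pixels with f(x,y) = 1 inside the square, which is only one of the orientations considered in this work; the function name is illustrative.

```python
def phi1_digitized_square(a):
    """First invariant moment of an axis-aligned square of a x a pixels."""
    pixels = [(x, y) for x in range(a) for y in range(a)]
    n = len(pixels)                      # mu_00 of the binary image
    x_bar = sum(x for x, _ in pixels) / n
    y_bar = sum(y for _, y in pixels) / n
    mu20 = sum((x - x_bar) ** 2 for x, _ in pixels)
    mu02 = sum((y - y_bar) ** 2 for _, y in pixels)
    # eta_20 + eta_02, with second order moments normalized by mu_00 squared.
    return (mu20 + mu02) / n ** 2


# Compare the summation with (1 - 1/a^2)/6 for a few scale factors.
comparison = {a: (phi1_digitized_square(a), (1 - 1 / a ** 2) / 6) for a in (3, 10, 20)}
```

For this particular digitization the two columns of `comparison` agree to rounding error, and both approach the continuous value of 1/6 as the scale factor grows.
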

4.5 Curvature as a feature parameter 

Since most if not all of the shape information of an image is contained in its
boundary, the use of features based on the $\psi$-$s$ curves of an image boundary seems
like an acceptable method to obtain unique features. One feature that might yield
promising results is the curvature of the boundary.

The curvature of a boundary is defined as the rate of change of $\psi$ with respect
to the arc length, and it can be easily obtained from the representation of the $\psi$-$s$
curve of the boundary of an image. Analyzing the curvature plots in Figures 4.5,
4.6, and 4.7 it is readily observed that there are exactly three jump discontinuities
for the triangle, four discontinuities for the square, and no discontinuities for the
circle. In general any quadrilateral will have four discontinuities, any triangle
will have three discontinuities, and any ellipse will have no discontinuities. This
information can be used to discriminate between the three shapes even if they are
perspectively distorted. If the boundary of an image is traced and the jumps in
curvature are counted, the shape will be determined. This formulation is based
on the assumption of a continuous image. Since the image is not continuous, a
modification has to be devised.

For a discrete image, the typical $\psi$-$s$ curve can be represented as an eight
directional chain code of the angles. This decreases the feasible angle space of an
image from infinity (in the continuous case) to eight, and $\Delta s$ will either be 1 or $\sqrt{2}$.
As a result of the sampling, straight lines in the actual object will not correspond
to straight lines in the chain coded version of the image boundary. The chain code
essentially links angles that form an approximation of the slope of the line if the
angles are averaged. If n is the number of link angles averaged, the approximation
will be within

$$ \pm \arctan(1/n) $$

of the actual slope of the line. Using the difference of the average of the links on both
sides of a particular chain code member will yield an approximation of the external
angle of the shape. A threshold can be used to determine which values correspond
to a significant angle change. With a proper selection of the threshold the square
and the triangle can be discriminated. The square should have four significant angle
changes and the triangle should have three significant angle changes. The circle is
identified using a slightly different approach. A digital representation of a circle will
contain on average angle changes of less than $\pi/2$ radians and both the square and
triangle will contain at least one exterior angle that is greater than or equal to
$\pi/2$ radians. Therefore, counting the number of angles greater than or equal to $\pi/2$
will determine if the object is a circle. This is assuming of course that the number
of pixels averaged does not exceed 1/8 of the chain code. Of course if the sampling
grid is sufficiently coarse it will be impossible to distinguish any of the shapes.
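
The averaging and thresholding steps described above can be sketched as follows. This is an illustrative reading of the procedure, not the implementation used in this work: it assumes an eight directional chain code in which code k stands for the angle k times $\pi/4$, unwraps the angles so that averages are not corrupted by the jump between codes 7 and 0, and counts each contiguous run of above-threshold positions as one angle change. The name `count_angle_changes` is hypothetical.

```python
import math


def count_angle_changes(chain, n, threshold):
    """Count significant angle changes along a closed chain-coded boundary.

    chain     -- list of codes 0..7, code k meaning direction k*pi/4
    n         -- number of links averaged on each side of a chain member
    threshold -- minimum averaged angle difference (radians) that counts
    """
    length = len(chain)

    # Unwrap the codes so that consecutive values differ by a signed step.
    unwrapped = [chain[0]]
    for i in range(1, length):
        step = (chain[i] - chain[i - 1] + 4) % 8 - 4
        unwrapped.append(unwrapped[-1] + step)
    closing = (chain[0] - chain[-1] + 4) % 8 - 4
    total_turn = unwrapped[-1] + closing - unwrapped[0]   # +-8 for a simple loop

    def code_at(i):
        # Periodic extension: each full lap adds the total turning.
        laps, index = divmod(i, length)
        return unwrapped[index] + laps * total_turn

    significant = []
    for k in range(length):
        after = sum(code_at(k + j) for j in range(1, n + 1)) / n
        before = sum(code_at(k - j) for j in range(1, n + 1)) / n
        change = abs(after - before) * math.pi / 4
        significant.append(change >= threshold)

    # A run of consecutive significant positions is one angle change;
    # index -1 wraps around, so runs crossing the start are counted once.
    return sum(1 for k in range(length) if significant[k] and not significant[k - 1])


square_chain = [0] * 10 + [2] * 10 + [4] * 10 + [6] * 10
triangle_chain = [0] * 24 + [3] * 12 + [5] * 12      # a right isosceles triangle
octagon_chain = [c for c in range(8) for _ in range(5)]
```

With three links averaged and a threshold between $\pi/4$ and $\pi/3$, the square and triangle chains above produce four and three angle changes respectively. The octagon, used here as a crude stand-in for a digitized circle, produces no averaged change as large as $\pi/2$.
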



CHAPTER 5 

EXPERIMENTAL SECTION 


In the sections that follow a labeling convention for the fiducial marks will be utilized.
Each mark is labeled with a three letter code, where the first letter of the code is
the first letter in the name of the outer shape, the second letter of the code is the
first letter in the name of the middle shape, and the third letter of the code is the
first letter in the name of the inner shape. For example, the code cct corresponds
to the mark that contains a circle as the outer and middle shapes and a triangle as
the inner shape.
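
The labeling convention can be written out in a few lines. The sketch below enumerates every three letter code over the letters c, s, and t and decodes a code into its outer, middle, and inner shapes; the helper names are illustrative.

```python
from itertools import product

SHAPES = {"c": "circle", "s": "square", "t": "triangle"}


def all_mark_codes():
    """Every three letter code: outer, middle, and inner shape initials."""
    return ["".join(letters) for letters in product("cst", repeat=3)]


def decode(code):
    """Expand a code such as 'cct' into the names of its three shapes."""
    outer, middle, inner = (SHAPES[letter] for letter in code)
    return {"outer": outer, "middle": middle, "inner": inner}


codes = all_mark_codes()
example = decode("cct")
```

Three shapes at three nesting levels give 3 x 3 x 3 = 27 distinct codes, which matches the twenty-seven composite marks used to label the arm.
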

5.1 Simulation Results 

Digital representations of a square, a circle, and an equilateral triangle are
created. The shapes are represented initially by a finite set of points. For example,
a square would be represented by its four vertices. In the first series of tests, the
points are used to obtain a digital representation of the shape, and features are
extracted. In particular the moment invariants and the external angle changes
are extracted to verify the analytical results. To simulate the results of orthogonal
projection, the points are rotated in space and appropriately transformed. The
digital representation is obtained, and the features are extracted. This is done to
observe the feature changes under these types of transformations. What follows is
the data obtained from these simulations.

The first set of simulations consists of rotating the square and equilateral triangle
in the image plane from 0 to 180 degrees in increments of 9 degrees. The
invariant moments are extracted at every orientation. This is repeated for different
scale factors (ratio of image size to pixel size). The scale factors range from 3 to 20.
The results for scale factors of 3, 10, and 20 are shown in Tables 5.1 and 5.2. Since
there is no change in the digital representation of a circle undergoing rotation in
the image plane, it is sufficient to extract the invariant moments once for each scale
factor. The results for scale factors of 3, 5, 10, 15, and 20 are shown in Table 5.3.
It is clear from Tables 5.1, 5.2, and 5.3 that the moment invariants for a digital
image are not strictly invariant, but for the larger scale factors there is a negligible
deviation between the theoretical values and the ones obtained via simulation. For
example, with a scale factor of 20, the percentage errors between the theoretical value
and the experimental value of $\phi_1$ for the square, circle, and equilateral triangle
are 0.183, 0.0075, and 0.09, respectively. It is also clear from the data that the
first invariant moments are sufficient to distinguish the shapes obtained via this
simulation.

Scale   Maximum    Minimum    Mean        Standard deviation
3       0.187500   0.152788   0.1836400   0.01091
10      0.166600   0.164931   0.1656439   5.98 x 10^-4
20      0.166875   0.165650   0.1663562   4.34 x 10^-4

Table 5.1: First invariant moment of a rotated and scaled square

Scale   Maximum     Minimum    Mean         Standard deviation
3       0.2287390   0.173416   0.1915400    0.01763
10      0.197430    0.187741   0.19279755   2.45 x 10^-3
20      0.193099    0.191600   0.19251845   3.60 x 10^-4

Table 5.2: First invariant moment of a rotated and scaled equilateral triangle

Scale   phi_1
3       0.158750
5       0.159024
10      0.159120
15      0.159136
20      0.159143

Table 5.3: First invariant moment of a scaled circle

The next series of tests simulates rotation about the x-axis of the object-centered
coordinate system by an angle $\alpha$ and about the y-axis of this system by
an angle $\phi$, where the values of $\alpha$ and $\phi$ vary from 9 to 45 degrees in increments of
9 degrees. After rotation, the shapes are orthogonally projected, and the invariant
moments are extracted. Tables 5.4 and 5.5 show some of the results. Incidentally,
because of the similarity between the invariant moments of the circle and the square,
the table for the circle was omitted.

phi (degrees)   alpha (degrees)   phi_1      phi_2      phi_3
27              9                 0.167588   0.000502   0.000000
27              27                0.170915   0.001510   0.000000
45              45                0.209200   0.016066   0.000000

Table 5.4: Invariant moments of a rotated and orthogonally projected square

phi (degrees)   alpha (degrees)   phi_1      phi_2      phi_3
27              9                 0.200274   0.009220   0.004566
27              27                0.211427   0.02765    0.011028
45              45                0.215370   0.031400   0.005398

Table 5.5: Invariant moments of a rotated and orthogonally projected
equilateral triangle

It is clear from Tables 5.5 and 5.4 that merely using $\phi_1$ as a decision criterion
yields incorrect results because there are occasions when the values of $\phi_1$ for the
orthogonally projected shapes overlap. But the invariant moments can still be used
to discriminate between the three shapes. For example, from Tables 5.5 and 5.4,
it is readily observed that the third invariant moment is less than $10^{-6}$ for the
square and is greater than $10^{-5}$ for the equilateral triangle. Similarly for the circle,
the third invariant moment is less than $10^{-6}$. This test can be used to uniquely
identify the equilateral triangle. To differentiate between the circle and the square,
the orthogonal invariant derived in Chapter Four can be utilized. This invariant is
given by $\phi_1^2 - \phi_2$. For a continuous square the orthogonal invariant is equivalent to $1/36$,
and for a continuous circle it is equivalent to $1/(4\pi^2)$. For the discrete representations
of the circle and square, the deviation from these values is a function of the scale
factor. For a scale factor of thirty, the percentage error obtained from the calculation
of the orthogonal invariant for the discrete circle and square is less than 0.3 percent,
and for a scale factor of twenty, the percentage error is less than 0.8 percent. As a
result, the discrimination of the three digitally represented shapes under orthogonal
projection can be obtained.

The simulations that are performed for the moments are repeated using the
features based on the $\psi$-$s$ data, with the notable exception that the scale factor was
never reduced below 10. The algorithm performed well. Circles and squares are
always correctly classified, and the triangles are misclassified once.

5.2 Experimental Results 

A Javelin C'C'D camera and Data Cube frame grabber are utilized to obtain 
grev level images of twenty-seven fiducial marks. The marks are placed at various 
orientations within the field ol view, these images are subsequently thresholded 
to obtain binarv images. The borders ul each shape within the fiducial marks 
are extracted. The centroid is obtained from the border of the outer shape and 
the moment invariants and the chain coded representation ot the ip - s curve are 
extracted from each ot the borders. 
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O i 

0-2 ^ 

<?3 

Outer shape 
Middle shape 
Inner shape 

0.261084 

0.214408 

0.326525 

0.030719 

0.004788 

0.022299 

0.008405 

0.000631 

0.005739 


Table 5.6: Invariant moment calculations of fiducial mark ttc 

The rationale behind using the moments as features is to obtain an uncomplicated
method to recognize the shapes. This is the reason that the perspective
transformation is approximated by an orthogonal projection: the transformation
properties of the moments under this simplification would make recognition of the
shapes relatively simple. Tables 5.6, 5.7, and 5.8 contain the moment invariants of
a few typical fiducial mark images. From these tables, it can be clearly observed
that the calculated moment invariants do not possess the convenient properties
associated with the moments of the orthogonally projected shapes. This observation
leads to the conclusion that another methodology is necessary to extract reliable
features from the moments. One methodology is to prescribe values for some of the
higher order moments (higher than order two) in an attempt to compensate for the
effects of the perspective transformation. This does not completely compensate for
the effects, but better results are obtained. The major problem with this approach
is that it becomes necessary to solve nonlinear equations to obtain the values
necessary to fix some of the higher order moments. This added complication
diminishes the value of using the moments as features. As a result, it is decided to
focus on the data obtained from the ψ-s curves to obtain reliable features.

                   φ1          φ2          φ3
Outer shape     0.193425    0.000019    0.000071
Middle shape    0.229851    0.011057    0.000982
Inner shape     0.357245    0.018707    0.056625

Table 5.7: Invariant moment calculations of fiducial mark tst


                   φ1          φ2          φ3
Outer shape     0.182459    0.000161    0.000064
Middle shape    0.193400    0.009736    0.000233
Inner shape     0.310022    0.025384    0.018699

Table 5.8: Invariant moment calculations of fiducial mark csc

The features utilized to identify the marks are based on the ψ-s curves of
the boundary of each shape. These prove to be more reliable than the moment
invariants. Two experiments are devised. In the first experiment, the outer shapes
are identified 100 percent of the time, the middle shapes are identified 96.3 percent
of the time, and the inner shapes are never identified. The algorithm fails when
trying to classify the inner shapes because there is an implicit assumption that the
length of a side of a shape is at least twice as large as the sample size utilized in
the chain code averaging scheme. For boundary lengths of approximately twenty or
less this assumption is not correct, and the algorithm cannot compensate. A second
experiment is designed to correctly identify shapes with relatively small boundary
lengths. In this experiment, the squares and triangles are discriminated 100 percent
of the time, but the circles are misclassified. The circles are classified as squares
in 7 out of 9 attempts, and the remaining times they are classified as triangles.
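
The role of the averaging sample size and of the angle threshold can be made concrete
with a simplified sketch. It is not the algorithm evaluated here: it assumes an ordered,
uniformly sampled closed boundary, measures the external angle change at each sample
between the chord arriving from k samples back and the chord leaving to k samples ahead,
and simply counts the runs of samples whose angle change exceeds a threshold (0 runs is
read as a circle, 3 as a triangle, 4 as a square). The window k = 5, the 30 degree
threshold, and all function names are assumptions chosen for illustration.

```python
import math

def polygon_boundary(vertices, samples_per_side):
    """Uniformly sampled closed polygon boundary (vertices included)."""
    points = []
    n = len(vertices)
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        for j in range(samples_per_side):
            t = j / samples_per_side
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points

def circle_boundary(radius, samples):
    return [(radius * math.cos(2 * math.pi * i / samples),
             radius * math.sin(2 * math.pi * i / samples)) for i in range(samples)]

def external_angle_changes(boundary, k):
    """Angle between the incoming and outgoing k-sample chords at every point."""
    n = len(boundary)
    changes = []
    for i in range(n):
        x0, y0 = boundary[(i - k) % n]
        x1, y1 = boundary[i]
        x2, y2 = boundary[(i + k) % n]
        a_in = math.atan2(y1 - y0, x1 - x0)
        a_out = math.atan2(y2 - y1, x2 - x1)
        # Wrap the difference into [-pi, pi).
        changes.append((a_out - a_in + math.pi) % (2 * math.pi) - math.pi)
    return changes

def count_corners(changes, threshold):
    """Number of circular runs of samples whose |angle change| exceeds threshold."""
    above = [abs(d) > threshold for d in changes]
    return sum(1 for i in range(len(above)) if above[i] and not above[i - 1])

def classify(boundary, k=5, threshold=math.radians(30)):
    corners = count_corners(external_angle_changes(boundary, k), threshold)
    return {0: "circle", 3: "triangle", 4: "square"}.get(corners, "unknown")

square = polygon_boundary([(0, 0), (40, 0), (40, 40), (0, 40)], 40)
triangle = polygon_boundary([(0, 0), (40, 0), (20, 20 * math.sqrt(3))], 40)
circle = circle_boundary(20.0, 120)
print(classify(square), classify(triangle), classify(circle))
```

In this sketch both chords at a sample lie wholly on one side only when that side is at
least 2k samples long, which is the kind of assumption the text identifies as failing
for the small inner shapes. A single angle change above the threshold on a digitized
circle is likewise enough to push it out of the circle class.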

Due to poor image quality, there are two occasions when the extraction of a
useful middle border is not possible. These cases correspond to the marks 'ttc' and
'stc'. These images are shown in Figures 5.1 and 5.2. Examples of images where
the marks are identified are shown in Figures 5.3, 5.4, 5.5 and 5.6. The case where
the middle shape is not correctly identified is shown in Figure 5.7. From this figure,
the reason the algorithm misclassified the square as a triangle is clear. There is
a significant amount of curvature on one side of the square, and since the external
angle change at the intersection of the curved segment and the line segment does not
exceed the preset threshold, it is not interpreted as the beginning of a new side. This
problem can be eliminated by adjusting the algorithm's threshold. Examples where
the algorithm misclassified the inner circles as squares are shown in Figures 5.8 and
5.2. It can be seen from these figures that there exist some chain averaged external
angle changes that exceed the preset angle threshold. This causes the algorithm to
attempt to classify the shape as either a square or a triangle, and it classifies the
shape as being closer to a square than a triangle. The algorithm misclassifies the
inner circle of fiducial mark 'tcc' in Figure 5.9 as a triangle. This occurs, by the
same reasoning, because the algorithm classifies the shape as being closer to a
triangle than a square.
CHAPTER 6

CONCLUSION and FUTURE WORK 


A method is proposed to identify the position and orientation of the links of the
PUMA 560. To accomplish this task, several fiducial marks will be placed on each
link. The goal is to uniquely identify and locate each fiducial mark. Two different
recognition algorithms are employed in an effort to ascertain the most reliable
method to identify the marks.

The first method utilized moment invariants and the second method utilized
the chain coded version of the ψ-s curve. For this application, the algorithm based
on the ψ-s curve outperforms the algorithm based on the moment invariants. The
moment algorithm does not prove to be very robust: it is not able to compensate
for the perspective distortion present in the imaged representations of the shapes.
The features based on the ψ-s curve performed reasonably well, but the algorithm
can be improved.

To find the limitations of the current fiducial mark recognition algorithm, more
images will be examined. This can be accomplished by taking images of the fiducial
marks at many different angles of inclination with respect to the image plane while
varying the distance to the image plane. These images would be used to determine
which positions and orientations of the marks cause the algorithm to function
poorly. Careful analysis of these results would add insight into the search for
more robust and efficient algorithms. The tests should encompass the full range of
the PUMA 560.

Utilizing the data from the ψ-s curve of the shape boundary, it is possible
to obtain more reliable recognition of the marks. Instead of only utilizing the
significant angle changes, the entire ψ-s curve can be utilized by segmenting it into
straight lines. These straight line segments provide valuable information regarding
the curvature of the boundary. Using this methodology, the percentage of the curved
portion of the boundary is obtained. This provides an efficient method to
discriminate the circle from the other shapes. For example, if 70 percent of a
boundary is sufficiently curved, the shape is probably a circle. To distinguish
between a rectangle and a triangle, one can keep track of the positions in the
boundary that correspond to the end points of line segments in the ψ-s curve. This
information is used to determine whether a particular line segment in the curve
corresponds to a significant portion of the boundary. If it does not, then the
segment can be interpreted as the result of a poor image.
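
The curved-portion test proposed above can be sketched as follows. This is a simplified
stand-in for segmenting the ψ-s curve into straight lines, not a worked-out version of
that proposal: it builds ψ(s) as the running sum of tangent angle increments along an
ordered, uniformly sampled boundary and labels a sample as curved when its increment is
neither nearly zero (flat) nor large (a corner). Only the 70 percent figure comes from
the text. The two tolerances and all function names are assumptions, and a digitized
boundary would need smoothing before such a test could be applied.

```python
import math

def polygon_boundary(vertices, samples_per_side):
    """Uniformly sampled closed polygon boundary (vertices included)."""
    points = []
    n = len(vertices)
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        for j in range(samples_per_side):
            t = j / samples_per_side
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points

def circle_boundary(radius, samples):
    return [(radius * math.cos(2 * math.pi * i / samples),
             radius * math.sin(2 * math.pi * i / samples)) for i in range(samples)]

def tangent_increments(boundary):
    """Change in tangent angle at each sample, wrapped into [-pi, pi)."""
    n = len(boundary)
    angles = []
    for i in range(n):
        (x0, y0), (x1, y1) = boundary[i], boundary[(i + 1) % n]
        angles.append(math.atan2(y1 - y0, x1 - x0))
    return [(angles[i] - angles[i - 1] + math.pi) % (2 * math.pi) - math.pi
            for i in range(n)]

def psi_curve(boundary):
    """psi(s) at each sample: running sum of the tangent angle increments."""
    psi, total = [], 0.0
    for d in tangent_increments(boundary):
        total += d
        psi.append(total)
    return psi

def curved_fraction(boundary, flat_tol=math.radians(1), corner_tol=math.radians(30)):
    """Fraction of samples where psi changes gradually (not flat, not a corner)."""
    increments = tangent_increments(boundary)
    curved = sum(1 for d in increments if flat_tol < abs(d) < corner_tol)
    return curved / len(increments)

def is_probably_circle(boundary, min_fraction=0.7):
    return curved_fraction(boundary) >= min_fraction

circle = circle_boundary(20.0, 120)
square = polygon_boundary([(0, 0), (40, 0), (40, 40), (0, 40)], 40)
print(curved_fraction(circle), curved_fraction(square))
print(is_probably_circle(circle), is_probably_circle(square))
```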

The improvements in the current algorithm and the verification of the corre- 
spondence of the centroid in different views of the same scene are topics that will 
be explored. 